DETAILED ACTION

Continued Examination Under 37 CFR 1.114 

1. A request for continued examination under 37 CFR 1.114, including the
fee set forth in 37 CFR 1.17(e), was filed in this application after final rejection.
Since this application is eligible for continued examination under 37 CFR 1.114,
and the fee set forth in 37 CFR 1.17(e) has been timely paid, the finality of the
previous Office action has been withdrawn pursuant to 37 CFR 1.114.
Applicant's submission filed on 2/13/2009 has been entered. 

Remarks 

2. Receipt of Applicant's Amendment, filed on 12/13/2009, is acknowledged.
The amendment includes the cancellation of claims 8, 12, 24, 29, 34, and 39-47,
and the amending of claims 1, 11, 17, 28, and 39.

Claim Rejections - 35 USC § 103 

3. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for
all obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described 
as set forth in section 102 of this title, if the differences between the subject matter sought to 
be patented and the prior art are such that the subject matter as a whole would have been 
obvious at the time the invention was made to a person having ordinary skill in the art to which 
said subject matter pertains. Patentability shall not be negatived by the manner in which the 
invention was made. 

4. Claims 1-7, 11, 13-16, 17-23, 27-28, 30-34, and 37-38 are rejected under
35 U.S.C. 103(a) as being unpatentable over Bellegarda et al. (Article entitled
"Exploiting Latent Semantic Information in Statistical Language Modeling," dated
10/26/2000) in view of Oliver et al. (U.S. Patent 7,158,986).

5. Regarding claim 1, Bellegarda teaches a method comprising:

A) mapping the files into a semantic vector space (Page 1279, Abstract); 

B) clustering the files within said space (Page 1279, Abstract);

C) wherein multiple threshold values that are settable to desired levels of
granularity are defined, and said files are clustered based on said multiple
threshold values (Page 1284).

The examiner notes that Bellegarda teaches "mapping the files into a
semantic vector space" as "(discrete) words and documents are mapped onto 
a (continuous) semantic vector space, in which familiar clustering techniques can 
be applied. This leads to the specification of a powerful framework for automatic 
semantic classification, as well as the derivation of several language model 
families with various smoothing properties" (Page 1279, Abstract). The examiner 
further notes that Bellegarda teaches "clustering the files within said space" 
as "(discrete) words and documents are mapped onto a (continuous) semantic 
vector space, in which familiar clustering techniques can be applied. This leads 
to the specification of a powerful framework for automatic semantic classification, 
as well as the derivation of several language model families with various 
smoothing properties" (Page 1279, Abstract). The examiner further notes that 
Bellegarda teaches "wherein multiple threshold values that are settable to
desired levels of granularity are defined, and said files are clustered based
on said multiple threshold values" as "Once (11) is specified, it is
straightforward to proceed with the clustering of the word vectors, using any of a
variety of algorithms (see, for instance, [2]). Since the number of such vectors is
relatively large, it is advisable to perform this clustering in stages, using, for
example, K-means and bottom-up clustering sequentially. In that case, K-means
clustering is used to obtain a coarse partition of the vocabulary into a small set
of superclusters. Each supercluster is then itself partitioned using bottom-up
clustering, resulting in a final set of clusters Ck, 1 <= k <= K. This process can be
thought of as uncovering, in a data-driven fashion, a particular layer of semantic
knowledge in the space" (Page 1284).
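
The staged procedure in the quoted passage (a coarse K-means partition, then bottom-up clustering inside each supercluster) can be sketched as follows. This is a minimal illustration and not Bellegarda's implementation: the function names, the Euclidean distance, the caller-supplied initial centroids, and the single-link merge threshold are all assumptions introduced here. The quoted passage names only the two stages.

```python
import math


def kmeans(points, centroids, iters=10):
    """Lloyd's k-means with caller-supplied initial centroids (assumed, for determinism)."""
    groups = []
    for _ in range(iters):
        groups = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            groups[nearest].append(p)
        # Move each centroid to the mean of its group; leave it in place if the group is empty.
        centroids = [
            tuple(sum(axis) / len(g) for axis in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return [g for g in groups if g]


def bottom_up(points, threshold):
    """Single-link agglomerative merging until no two clusters lie within `threshold`."""
    clusters = [[p] for p in points]
    merged = True
    while merged:
        merged = False
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                gap = min(math.dist(a, b) for a in clusters[i] for b in clusters[j])
                if gap <= threshold:
                    clusters[i].extend(clusters.pop(j))
                    merged = True
                    break
            if merged:
                break
    return clusters


def two_stage(points, initial_centroids, threshold):
    """Coarse superclusters first, then a finer partition inside each supercluster."""
    final = []
    for supercluster in kmeans(points, initial_centroids):
        final.extend(bottom_up(supercluster, threshold))
    return final
```

In this sketch the hypothetical `threshold` parameter controls granularity: a smaller value leaves more, tighter clusters inside each supercluster, and a larger value merges them.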

Bellegarda does not explicitly teach: 

D) deriving a hierarchy of plural level of clusters from said clustering; 

E) displaying the files in a hierarchical format of plural level of clusters based on 
said derived hierarchy. 

Oliver, however, teaches "deriving a hierarchy of plural level of 
clusters from said clustering" as "In the preferred embodiment of the
invention, the recommendation software uses a statistical process referred to
herein as document clustering to group together those documents of the client 
document server that have been viewed by the user according to their common 
themes and concepts. For each individual user, the recommendation software 
clusters those documents that have the most themes and concepts in common 
with one another into interest folders 505. In the preferred embodiment, the 
recommendation software continually monitors each user and continually 
updates the user's interest folders and profile" (Column 12, lines 44-54) and "In 
the preferred embodiment of the present invention, the recommendation software 
uses a proprietary clustering algorithm to form the user interest folders. The 
clustering algorithm uses the textual content of the documents viewed by a user, 
in combination with structural information about the document server, and 
ancillary information about the user to determine the interest folders for a user" 
(Column 13, lines 10-16), and "displaying the files in a hierarchical format of 
plural level of clusters based on said derived hierarchy" as "One significant 
feature of the clustering algorithm used by the invention is that the output of the 
algorithm can be readily viewed and understood. Each document cluster (interest 
folder) is described by the most relevant keywords of the documents within the 
document cluster 510. This feature enables both users and marketers to 
understand and control the degree of personalization and targeting that is made" 
(Column 13, lines 22-28) and "FIG. 6 is an example of a user profile 600
generated by the recommendation software, according to the preferred 
embodiment of the present invention. The profile shown in the personalized Web 
page of FIG. 6 comprises two different interest folders 602, 604 for a user of an 
on-line auction Web site. Each interest folder contains pages which are 
intrinsically similar to one another and dissimilar to pages in other interest 
folders. A specific interest folder contains a set of links 610 to auctions the user 
has viewed that are related to the theme of the interest folder. An interest folder 
can also include additional information including but not limited to information 
regarding the history of the user's Internet viewing, recommendations for the
user, a summary of the user's purchases. In the example illustrated in FIG. 6,
each interest folder also has an associated set of keywords 612 that summarize 
the most important concepts of the particular interest folder, as determined by the 
recommendation software" (Column 14, lines 30-46). 
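
The idea of describing each document cluster (interest folder) by its most relevant keywords and showing the folder with its documents can be sketched as below. Oliver describes its clustering algorithm as proprietary, so this is only an assumed stand-in: raw term frequency is used as the relevance measure, and the function names and the text layout are hypothetical.

```python
from collections import Counter


def top_keywords(documents, k=3):
    """Return the k most frequent lowercase terms across one cluster's documents."""
    counts = Counter(word for doc in documents for word in doc.lower().split())
    return [word for word, _ in counts.most_common(k)]


def render_folders(folders, k=3):
    """Render each folder as a keyword summary line followed by its indented documents."""
    lines = []
    for name, documents in folders.items():
        lines.append(f"{name} [{', '.join(top_keywords(documents, k))}]")
        lines.extend(f"    {doc}" for doc in documents)
    return "\n".join(lines)
```

A production system would likely weight terms (for example, by how distinctive they are across folders) rather than count them; plain counts keep the sketch short.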

It would have been obvious to one of ordinary skill in the art at the time the
invention was made to combine the teachings of the cited references because
Oliver's teaching would have allowed Bellegarda's method to provide automated
organization for users, as noted by Oliver (Column 3, lines 23-32).

Regarding claim 2, Bellegarda does not explicitly teach a method 
comprising: 

A) wherein the step of clustering the files is performed as a background routine 
during the operation of a computer associated with said file system. 

Oliver, however, teaches "wherein the step of clustering the files is 
performed as a background routine during the operation of a computer 
associated with said file system" as "In the preferred embodiment, the 
recommendation software continually monitors each user and continually 
updates the user's interest folders and profile" (Column 12, lines 52-54). 

It would have been obvious to one of ordinary skill in the art at the time the
invention was made to combine the teachings of the cited references because
Oliver's teaching would have allowed Bellegarda's method to provide automated
organization for users, as noted by Oliver (Column 3, lines 23-32).

Regarding claim 3, Bellegarda further teaches a method comprising: 
A) wherein the step of clustering the files is performed in response to the 
creation of a new file within the file system (Page 1286, Section: A. Framework 
Extension). 



Application/Control Number: 10/644,815 Page 6 

Art Unit: 2168 

The examiner notes that Bellegarda teaches "wherein the step of 
clustering the files is performed in response to the creation of a new file 
within the file system" as "finding a new representation for a new document in 
the space S is straightforward" (Page 1286, Section: A. Framework Extension). 
The examiner further notes that it is clear that the method of Bellegarda clusters 
when a new document is noticed. 
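
To make concrete what representing a new document in the space S might involve, the sketch below uses the usual LSA "folding-in" projection, d^T U S^-1, and then assigns the projected vector to the nearest existing cluster centroid. The quoted passage says only that the new representation is straightforward; the formula, the function names, and the nearest-centroid assignment are assumptions made for illustration.

```python
def fold_in(doc_counts, U, S):
    """Project a new document's word-count column into the R-dimensional space: d^T U S^-1.

    doc_counts: length-M list of word counts for the new document.
    U: M x R matrix (list of rows) of left singular vectors.
    S: length-R list of singular values.
    """
    return [
        sum(doc_counts[i] * U[i][r] for i in range(len(doc_counts))) / S[r]
        for r in range(len(S))
    ]


def nearest_cluster(vector, centroids):
    """Index of the closest centroid by squared Euclidean distance (an assumed metric)."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(centroids)), key=lambda k: squared_distance(vector, centroids[k]))
```

Note that folding in places a new document without recomputing the decomposition, so the existing clusters are reused rather than rebuilt.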

Regarding claim 4, Bellegarda further teaches a method comprising: 

A) wherein said files are text documents (Page 1279, Abstract); and 

B) said mapping is conducted on the basis of a language model (Page 1279, 
Abstract). 

The examiner notes that Bellegarda teaches "wherein said files are text 
documents" as "This paper focuses on the use of latent semantic analysis, a 
paradigm that automatically uncovers the salient semantic relationships between 
words and documents in a given corpus" (Page 1279, Abstract). The examiner 
further notes that Bellegarda teaches "said mapping is conducted on the 
basis of a language model" as "(discrete) words and documents are mapped 
onto a (continuous) semantic vector space, in which familiar clustering 
techniques can be applied. This leads to the specification of a powerful 
framework for automatic semantic classification, as well as the derivation of 
several language model families with various smoothing properties" (Page 1279, 
Abstract). 

Regarding claim 5, Bellegarda further teaches a method comprising: 

A) wherein said mapping step comprises the steps of constructing a matrix 
which associates each word in the documents with a vector (Page 1281, Section: 
A. Feature Extraction, Section: B. Singular Value Decomposition); and 

B) associates each document with a vector (Page 1281, Section: A. Feature
Extraction, Section: B. Singular Value Decomposition). 



The examiner notes that Bellegarda teaches "wherein said mapping
step comprises the steps of constructing a matrix which associates each
word in the documents with a vector" as "The starting point is the construction
of a matrix (W) of co-occurrences between words and documents" (Page 1281,
Section: A. Feature Extraction) and "The (M x N) word-document matrix W
resulting from the above feature extraction defines two vector representations for
the words and the documents. Each word wi can be uniquely associated with a
row vector of dimension N, and each document dj can be uniquely associated
with a column vector of dimension M" (Page 1281, Section: B. Singular Value
Decomposition). The examiner further notes that Bellegarda teaches
"associates each document with a vector" as "The (M x N) word-document
matrix W resulting from the above feature extraction defines two vector
representations for the words and the documents. Each word wi can be uniquely
associated with a row vector of dimension N, and each document dj can be
uniquely associated with a column vector of dimension M" (Page 1281, Section:
B. Singular Value Decomposition).
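
The two vector representations described in the quoted passage can be illustrated with a small word-document matrix. This is a sketch under stated assumptions: raw counts are used for the entries (the paper's feature extraction may weight or normalize them), tokenization is a plain whitespace split, and all function names are hypothetical.

```python
def word_document_matrix(documents):
    """Build an M x N count matrix W: one row per word, one column per document."""
    vocab = sorted({w for doc in documents for w in doc.lower().split()})
    row_of = {w: i for i, w in enumerate(vocab)}
    W = [[0] * len(documents) for _ in vocab]
    for j, doc in enumerate(documents):
        for w in doc.lower().split():
            W[row_of[w]][j] += 1
    return vocab, W


def word_vector(W, i):
    """Row i of W: the dimension-N vector associated with word i."""
    return list(W[i])


def document_vector(W, j):
    """Column j of W: the dimension-M vector associated with document j."""
    return [row[j] for row in W]
```

With M words and N documents, every word vector has N entries and every document vector has M entries, matching the row and column dimensions named in the quotation.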

Regarding claim 6, Bellegarda further teaches a method comprising: 
A) the step of decomposing said matrix to define the words and documents as 
vectors in a continuous vector space (Page 1281, Section: A. Feature
Extraction, Section: B. Singular Value Decomposition). 

The examiner notes that Bellegarda teaches "the step of decomposing 
said matrix to define the words and documents as vectors in a continuous 
vector space" as "To address these issues, it is useful to employ a singular 
value decomposition (SVD), a technique closely related to eigenvector 
decomposition and factor analysis" (Page 1281, Section: B. Singular Value 
Decomposition). 
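
As a rough illustration of what a singular value decomposition yields, the sketch below recovers only the leading singular triplet (u, sigma, v) of a small matrix by power iteration on W^T W. It is not a full SVD and not the paper's procedure; a numerical library routine would normally be used instead. The function name and iteration count are assumptions.

```python
import math


def leading_singular_triplet(W, iters=100):
    """Approximate (u, sigma, v) for the largest singular value of W.

    Uses power iteration on W^T W. Assumes W is nonzero and that the uniform
    starting vector is not orthogonal to the leading right singular vector.
    """
    M, N = len(W), len(W[0])
    v = [1.0 / math.sqrt(N)] * N
    for _ in range(iters):
        Wv = [sum(W[i][j] * v[j] for j in range(N)) for i in range(M)]
        WtWv = [sum(W[i][j] * Wv[i] for i in range(M)) for j in range(N)]
        norm = math.sqrt(sum(x * x for x in WtWv))
        v = [x / norm for x in WtWv]
    # sigma is the length of W v, and u is W v rescaled to unit length.
    Wv = [sum(W[i][j] * v[j] for j in range(N)) for i in range(M)]
    sigma = math.sqrt(sum(x * x for x in Wv))
    u = [x / sigma for x in Wv]
    return u, sigma, v
```

Each further singular triplet adds one more dimension to the continuous space in which the word and document vectors are placed.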

Regarding claim 7, Bellegarda further teaches a method comprising:

A) wherein said clustering is performed by identifying documents whose vectors
are within a threshold distance of one another (Page 1284, Section: A. Word 
Clustering). 

The examiner notes that Bellegarda teaches "wherein said clustering is 
performed by identifying documents whose vectors are within a threshold 
distance of one another" as "This opens up the opportunity to apply familiar 
clustering techniques in S, as long as a distance measure consistent with the 
SVD formalism is defined on the vector space" (Page 1286, Section: A. 
Framework Extension). 
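
Grouping documents whose vectors lie within a threshold distance of one another can be sketched as follows. The cosine-based distance, the linkage rule (documents are grouped when connected by a chain of close pairs), and the function names are assumptions chosen for illustration; the quoted passage requires only some distance measure consistent with the SVD formalism.

```python
import math


def cosine_distance(a, b):
    """1 - cos(a, b); assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def threshold_clusters(vectors, threshold):
    """Group vector indices that are linked by pairwise distances <= threshold."""
    unvisited = set(range(len(vectors)))
    clusters = []
    while unvisited:
        seed = min(unvisited)
        unvisited.remove(seed)
        cluster, frontier = [seed], [seed]
        while frontier:
            i = frontier.pop()
            near = [j for j in unvisited
                    if cosine_distance(vectors[i], vectors[j]) <= threshold]
            for j in near:
                unvisited.remove(j)
            cluster.extend(near)
            frontier.extend(near)
        clusters.append(sorted(cluster))
    return clusters
```

Running the same function with several threshold values gives partitions of different granularity over the same set of vectors.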

Regarding claim 11, Bellegarda teaches a graphical user interface
comprising: 

A) a virtual file system (Page 1279, Abstract); and 

B) clustering said files based on multiple threshold values that are settable to 
desired levels of granularity (Page 1284). 

The examiner notes that Bellegarda teaches "a virtual file system with 
a semantic hierarchy, wherein the semantic hierarchy is based on 
clustering of files based on semantic similarities" as "(discrete) words and 
documents are mapped onto a (continuous) semantic vector space, in which 
familiar clustering techniques can be applied. This leads to the specification of a 
powerful framework for automatic semantic classification, as well as the 
derivation of several language model families with various smoothing properties" 
(Page 1279, Abstract). The examiner further notes that Bellegarda teaches
"clustering said files based on multiple threshold values that are settable
to desired levels of granularity" as "Once (11) is specified, it is straightforward
to proceed with the clustering of the word vectors, using any of a variety of
algorithms (see, for instance, [2]). Since the number of such vectors is relatively
large, it is advisable to perform this clustering in stages, using, for example,
K-means and bottom-up clustering sequentially. In that case, K-means clustering is
used to obtain a coarse partition of the vocabulary into a small set of
superclusters. Each supercluster is then itself partitioned using bottom-up
clustering, resulting in a final set of clusters Ck, 1 <= k <= K. This process can be
thought of as uncovering, in a data-driven fashion, a particular layer of semantic
knowledge in the space" (Page 1284).

Bellegarda does not explicitly teach: 
A) A graphical user interface configured to display files with a semantic hierarchy 
of plural levels of clusters that is derived from semantic similarities of said files; 
C) determining a directory structure having plural levels of clusters based on the 
clustering determined from similarities between said files, wherein the graphical 
user interface graphically presents the determined directory structure having 
plural levels of clusters to be displayed on a display device. 

Oliver, however, teaches "A graphical user interface configured to 
display files with a semantic hierarchy of plural levels of clusters that is 
derived from semantic similarities of said files" as "In the preferred 
embodiment of the invention, the recommendation software uses a statistical 
process referred to herein as document clustering to group together those 
documents of the client document server that have been viewed by the user 
according to their common themes and concepts. For each individual user, the 
recommendation software clusters those documents that have the most themes 
and concepts in common with one another into interest folders 505. In the 
preferred embodiment, the recommendation software continually monitors each 
user and continually updates the user's interest folders and profile" (Column 12, 
lines 44-54) and "In the preferred embodiment of the present invention, the 
recommendation software uses a proprietary clustering algorithm to form the 
user interest folders. The clustering algorithm uses the textual content of the 
documents viewed by a user, in combination with structural information about the 
document server, and ancillary information about the user to determine the 
interest folders for a user" (Column 13, lines 10-16), and "determining a 
directory structure having plural levels of clusters based on the clustering 
determined from similarities between said files, wherein the graphical user
interface graphically presents the determined directory structure having
plural levels of clusters to be displayed on a display device" as "In the
preferred embodiment of the invention, the recommendation software uses a
statistical process referred to herein as document clustering to group together 
those documents of the client document server that have been viewed by the 
user according to their common themes and concepts. For each individual user, 
the recommendation software clusters those documents that have the most 
themes and concepts in common with one another into interest folders 505. In 
the preferred embodiment, the recommendation software continually monitors 
each user and continually updates the user's interest folders and profile" (Column 
12, lines 44-54), "In the preferred embodiment of the present invention, the 
recommendation software uses a proprietary clustering algorithm to form the 
user interest folders. The clustering algorithm uses the textual content of the 
documents viewed by a user, in combination with structural information about the 
document server, and ancillary information about the user to determine the 
interest folders for a user" (Column 13, lines 10-16), "One significant feature of 
the clustering algorithm used by the invention is that the output of the algorithm 
can be readily viewed and understood. Each document cluster (interest folder) is 
described by the most relevant keywords of the documents within the document 
cluster 510. This feature enables both users and marketers to understand and 
control the degree of personalization and targeting that is made" (Column 13, 
lines 22-28) and "FIG. 6 is an example of a user profile 600 generated by the 
recommendation software, according to the preferred embodiment of the present 
invention. The profile shown in the personalized Web page of FIG. 6 comprises 
two different interest folders 602, 604 for a user of an on-line auction Web site. 
Each interest folder contains pages which are intrinsically similar to one another 
and dissimilar to pages in other interest folders. A specific interest folder 
contains a set of links 610 to auctions the user has viewed that are related to the 
theme of the interest folder. An interest folder can also include additional 
information including but not limited to information regarding the history of the
user's Internet viewing, recommendations for the user, a summary of the user's
purchases. In the example illustrated in FIG. 6, each interest folder also has an 
associated set of keywords 612 that summarize the most important concepts of 
the particular interest folder, as determined by the recommendation software" 
(Column 14, lines 30-46). 

It would have been obvious to one of ordinary skill in the art at the time the
invention was made to combine the teachings of the cited references because
Oliver's teaching would have allowed Bellegarda's method to provide automated
organization for users, as noted by Oliver (Column 3, lines 23-32).

Regarding claim 13, Bellegarda does not explicitly teach a graphical user 
interface comprising: 

A) wherein clustering of the files is initiated by user selection. 

Oliver, however, teaches "wherein clustering of the files is initiated by 
user selection" as "While the present invention is designed to automatically 
match users with relevant content, it is recognized that a client might wish to 
customize the manner in which users receive special promotions, event 
announcements and special news items. In the example of the Roman coin 
collector, a marketer of cruises might wish to target the collector with a promotion 
for a cruise of the Mediterranean" (Column 15, lines 23-29). 

It would have been obvious to one of ordinary skill in the art at the time the
invention was made to combine the teachings of the cited references because
Oliver's teaching would have allowed Bellegarda's method to provide automated
organization for users, as noted by Oliver (Column 3, lines 23-32).

Regarding claim 14, Bellegarda further teaches a graphical user interface 
comprising:

A) wherein clustering of the files is initiated upon creation of a new file in the file
system (Page 1286, Section: A. Framework Extension). 

The examiner notes that Bellegarda teaches "wherein clustering of the 
files is initiated upon creation of a new file in the file system" as "finding a 
new representation for a new document in the space S is straightforward" (Page 
1286, Section: A. Framework Extension). The examiner further notes that it is 
clear that the method of Bellegarda clusters when a new document is noticed. 

Regarding claim 15, Bellegarda further teaches a graphical user interface 
comprising: 

A) wherein text files are clustered utilizing a language model (Page 1279, 
Abstract). 

The examiner notes that Bellegarda teaches "analyzing files in a file
system to determine similarities in data pertaining to their content" as
"(discrete) words and documents are mapped onto a (continuous) semantic
vector space, in which familiar clustering techniques can be applied. This leads
to the specification of a powerful framework for automatic semantic classification,
as well as the derivation of several language model families with various
smoothing properties" (Page 1279, Abstract).

Bellegarda does not explicitly teach:

B) non-text files are clustered utilizing rule-based techniques. 

Oliver, however, teaches "non-text files are clustered utilizing
rule-based techniques" as "The marketing system sends the recommended
document(s), or a link to the recommended document(s) back to the client's 
document server 430. The recommendations can include but are not limited to 
URLs, product numbers, advertisements, products, animations, graphic displays, 
sound files, and applets that are selected, based on the user profile, to be 
interesting and relevant to the user. For example, the most relevant ad for any 
page can be rapidly determined by comparing the current user profile with the 
description of the available advertisements" (Column 11, lines 36-45) and "While
the present invention is designed to automatically match users with relevant
content, it is recognized that a client might wish to customize the manner in 
which users receive special promotions, event announcements and special news 
items. In the example of the Roman coin collector, a marketer of cruises might 
wish to target the collector with a promotion for a cruise of the Mediterranean" 
(Column 15, lines 23-29). 

It would have been obvious to one of ordinary skill in the art at the time the
invention was made to combine the teachings of the cited references because
Oliver's teaching would have allowed Bellegarda's method to provide automated
organization for users, as noted by Oliver (Column 3, lines 23-32).

Regarding claim 16, Bellegarda further teaches a graphical user interface 
comprising: 

A) wherein said language model comprises the LSA paradigm (Page 1281, 
Section: D. Organization). 

The examiner notes that Bellegarda teaches "wherein said language 
model comprises the LSA paradigm" as "The focus of this paper is on 
semantically driven span extension only, and more specifically on how the LSA 
paradigm can be exploited to improve statistical language modeling" (Page 1281, 
Section: D. Organization). 

Regarding claim 17, Bellegarda teaches a computer-readable media 
comprising: 

A) analyzing files in a file system to determine similarities in data pertaining to 
their content (Page 1279, Abstract); 

B) clustering said files based on multiple threshold values that are settable to 
desired levels of granularity (Page 1284);

The examiner notes that Bellegarda teaches "analyzing files in a file 
system to determine similarities in data pertaining to their content" as
"(discrete) words and documents are mapped onto a (continuous) semantic
vector space, in which familiar clustering techniques can be applied. This leads 
to the specification of a powerful framework for automatic semantic classification, 
as well as the derivation of several language model families with various 
smoothing properties" (Page 1279, Abstract). The examiner further notes that 
Bellegarda teaches "clustering said files based on multiple threshold
values that are settable to desired levels of granularity" as "Once (11) is
specified, it is straightforward to proceed with the clustering of the word vectors,
using any of a variety of algorithms (see, for instance, [2]). Since the number of
such vectors is relatively large, it is advisable to perform this clustering in stages,
using, for example, K-means and bottom-up clustering sequentially. In that case,
K-means clustering is used to obtain a coarse partition of the vocabulary into a
small set of superclusters. Each supercluster is then itself partitioned using
bottom-up clustering, resulting in a final set of clusters Ck, 1 <= k <= K. This
process can be thought of as uncovering, in a data-driven fashion, a particular
layer of semantic knowledge in the space" (Page 1284).

Bellegarda does not explicitly teach:

C) determining a directory structure having plural levels of clusters based on the 
clustering determined from similarities between the files; 

D) displaying files in hierarchical format of plural levels of clusters based on the 
clustering determined from similarities between the files. 

Oliver, however, teaches "determining a directory structure having 
plural levels of clusters based on the clustering determined from 
similarities between the files" as "In the preferred embodiment of the invention, 
the recommendation software uses a statistical process referred to herein as 
document clustering to group together those documents of the client document 
server that have been viewed by the user according to their common themes and 
concepts. For each individual user, the recommendation software clusters those 
documents that have the most themes and concepts in common with one 
another into interest folders 505. In the preferred embodiment, the
recommendation software continually monitors each user and continually
updates the user's interest folders and profile" (Column 12, lines 44-54), "In the 
preferred embodiment of the present invention, the recommendation software 
uses a proprietary clustering algorithm to form the user interest folders. The 
clustering algorithm uses the textual content of the documents viewed by a user, 
in combination with structural information about the document server, and 
ancillary information about the user to determine the interest folders for a user" 
(Column 13, lines 10-16), "One significant feature of the clustering algorithm 
used by the invention is that the output of the algorithm can be readily viewed 
and understood. Each document cluster (interest folder) is described by the most 
relevant keywords of the documents within the document cluster 510. This 
feature enables both users and marketers to understand and control the degree 
of personalization and targeting that is made" (Column 13, lines 22-28) and "FIG. 
6 is an example of a user profile 600 generated by the recommendation software, 
according to the preferred embodiment of the present invention. The profile 
shown in the personalized Web page of FIG. 6 comprises two different interest 
folders 602, 604 for a user of an on-line auction Web site. Each interest folder 
contains pages which are intrinsically similar to one another and dissimilar to 
pages in other interest folders. A specific interest folder contains a set of links 
610 to auctions the user has viewed that are related to the theme of the interest 
folder. An interest folder can also include additional information including but not 
limited to information regarding the history of the user's Internet viewing, 
recommendations for the user, a summary of the user's purchases. In the 
example illustrated in FIG. 6, each interest folder also has an associated set of 
keywords 612 that summarize the most important concepts of the particular 
interest folder, as determined by the recommendation software" (Column 14, 
lines 30-46), and "displaying files in hierarchical format of plural levels of 
clusters based on the clustering determined from similarities between the 
files" as "One significant feature of the clustering algorithm used by the invention 
is that the output of the algorithm can be readily viewed and understood. Each
document cluster (interest folder) is described by the most relevant keywords of
the documents within the document cluster 510. This feature enables both users 
and marketers to understand and control the degree of personalization and 
targeting that is made" (Column 13, lines 22-28) and "FIG. 6 is an example of a 
user profile 600 generated by the recommendation software, according to the 
preferred embodiment of the present invention. The profile shown in the 
personalized Web page of FIG. 6 comprises two different interest folders 602, 
604 for a user of an on-line auction Web site. Each interest folder contains pages 
which are intrinsically similar to one another and dissimilar to pages in other 
interest folders. A specific interest folder contains a set of links 610 to auctions 
the user has viewed that are related to the theme of the interest folder. An 
interest folder can also include additional information including but not limited to 
information regarding the history of the user's Internet viewing, recommendations 
for the user, a summary of the user's purchases. In the example illustrated in 
FIG. 6, each interest folder also has an associated set of keywords 612 that 
summarize the most important concepts of the particular interest folder, as 
determined by the recommendation software" (Column 14, lines 30-46). 

It would have been obvious to one of ordinary skill in the art at the time the
invention was made to combine the teachings of the cited references because
Oliver's teaching would have allowed Bellegarda's method to provide automated
organization for users, as noted by Oliver (Column 3, lines 23-32).

Regarding claim 18, Bellegarda further teaches a computer-readable 
media comprising: 

A) wherein said files are text documents (Page 1279, Abstract); and 

B) the similarities are based upon the word content of the files (Page 1281,
Section: A. Feature Extraction, Section: B. Singular Value Decomposition). 

The examiner notes that Bellegarda teaches "wherein said files are text 
documents" as "This paper focuses on the use of latent semantic analysis, a 



Application/Control Number: 10/644,815 Page 
Art Unit: 2168 

paradigm that automatically uncovers the salient semantic relationships between 
words and documents in a given corpus" (Page 1279, Abstract). The examiner 
further notes that Bellegarda teaches "the similarities are based upon the 
word content of the files" as "The starting point is the construction of a matrix 
(W) of co-occurrences between words and documents" (Page 1281, Section: A. 
Feature Extraction) and "The (M x N) word-document matrix W resulting from the 
above feature extraction defines two vector representations for the words and the 
documents. Each word w_i can be uniquely associated with a row vector of 
dimension N, and each document d_j can be uniquely associated with a column 
vector of dimension M" (Page 1281, Section: B. Singular Value Decomposition). 
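
For illustration only, and not as part of the cited reference, the word-document 
matrix described in the quoted passage can be sketched as follows. The corpus, 
the helper name build_word_document_matrix, and the use of raw counts (in place 
of whatever weighting the reference applies) are assumptions made for this 
sketch. 

```python
# Illustrative sketch only: corpus, helper name, and raw-count weighting are
# assumptions, not taken from the cited reference.
from collections import Counter


def build_word_document_matrix(documents):
    """Return (vocabulary, W), where W[i][j] counts word i in document j."""
    tokenized = [doc.lower().split() for doc in documents]
    vocabulary = sorted({word for tokens in tokenized for word in tokens})
    counts = [Counter(tokens) for tokens in tokenized]
    # One row per word (M rows), one column per document (N columns).
    matrix = [[count[word] for count in counts] for word in vocabulary]
    return vocabulary, matrix


documents = [
    "stock market prices fall",
    "market prices rise",
    "coin auction prices",
]
vocabulary, W = build_word_document_matrix(documents)

M, N = len(W), len(W[0])
# Each word is associated with a row vector of dimension N.
word_row = W[vocabulary.index("prices")]
# Each document is associated with a column vector of dimension M.
document_column = [W[i][1] for i in range(M)]
```

In this sketch the row for "prices" has one entry per document, and the column 
for the second document has one entry per vocabulary word, mirroring the two 
vector representations named in the quotation. 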

Regarding claim 19, Bellegarda further teaches a computer-readable 
media comprising: 

A) wherein said similarities are determined in accordance with a language model 
(Page 1279, Abstract, Page 1281, Section: D. Organization); and 

B) the files are clustered in accordance with said model (Page 1279, Abstract, 
Page 1281, Section: D. Organization). 

The examiner notes that Bellegarda teaches "wherein said similarities 
are determined in accordance with a language model" as "(discrete) words 
and documents are mapped onto a (continuous) semantic vector space, in which 
familiar clustering techniques can be applied. This leads to the specification of a 
powerful framework for automatic semantic classification, as well as the 
derivation of several language model families with various smoothing properties" 
(Page 1279, Abstract). The examiner further notes that Bellegarda teaches "the 
files are clustered in accordance with said model" as "(discrete) words and 
documents are mapped onto a (continuous) semantic vector space, in which 
familiar clustering techniques can be applied. This leads to the specification of a 
powerful framework for automatic semantic classification, as well as the 
derivation of several language model families with various smoothing properties" 
(Page 1279, Abstract). 

Regarding claim 20, Bellegarda further teaches a computer-readable 
media comprising: 

A) wherein said language model comprises the LSA paradigm (Page 1281, 
Section: D. Organization). 

The examiner notes that Bellegarda teaches "wherein said language 
model comprises the LSA paradigm" as "The focus of this paper is on 
semantically driven span extension only, and more specifically on how the LSA 
paradigm can be exploited to improve statistical language modeling" (Page 1281, 
Section: D. Organization). 

Regarding claim 21, Bellegarda further teaches a computer-readable 
media comprising: 

A) wherein said computer-executable code performs the steps of constructing a 
matrix which associates each word in the documents with a vector (Page 1281, 
Section: A. Feature Extraction, Section: B. Singular Value Decomposition); 
and 

B) associates each document with a vector (Page 1281, Section: A. Feature 
Extraction, Section: B. Singular Value Decomposition). 

The examiner notes that Bellegarda teaches "wherein said computer- 
executable code performs the steps of constructing a matrix which 
associates each word in the documents with a vector" as "The starting point 
is the construction of a matrix (W) of co-occurrences between words and 
documents" (Page 1281, Section: A. Feature Extraction) and "The (M x N) 
word-document matrix W resulting from the above feature extraction defines two 
vector representations for the words and the documents. Each word w_i can be 
uniquely associated with a row vector of dimension N, and each document d_j can 
be uniquely associated with a column vector of dimension M" (Page 1281, 
Section: B. Singular Value Decomposition). The examiner further notes that 
Bellegarda teaches "associates each document with a vector" as "The (M x 
N) word-document matrix W resulting from the above feature extraction defines 
two vector representations for the words and the documents. Each word w_i can 
be uniquely associated with a row vector of dimension N, and each document d_j 
can be uniquely associated with a column vector of dimension M" (Page 1281, 
Section: B. Singular Value Decomposition). 

Regarding claim 22, Bellegarda further teaches a computer-readable 
media comprising: 

A) wherein said computer-executable code further performs step of 
decomposing said matrix to define the words and documents as vectors in a 
continuous vector space (Page 1281, Section: A. Feature Extraction, Section: 
B. Singular Value Decomposition). 

The examiner notes that Bellegarda teaches "wherein said computer- 
executable code further performs step of decomposing said matrix to 
define the words and documents as vectors in a continuous vector space" 
as "To address these issues, it is useful to employ a singular value 
decomposition (SVD), a technique closely related to eigenvector decomposition 
and factor analysis" (Page 1281, Section: B. Singular Value Decomposition). 

Regarding claim 23, Bellegarda further teaches a computer-readable 
media comprising: 

A) wherein said computer-executable code performs clustering by identifying 
documents whose vectors are within a threshold distance of one another (Page 
1284, Section: A. Word Clustering). 

The examiner notes that Bellegarda teaches "wherein said computer- 
executable code performs clustering by identifying documents whose 
vectors are within a threshold distance of one another" as "This opens up 
the opportunity to apply familiar clustering techniques in S, as long as a distance 
measure consistent with the SVD formalism is defined on the vector space" 
(Page 1286, Section: A. Framework Extension). 

Regarding claim 27, Bellegarda further teaches a computer-readable 
media comprising: 

A) wherein the computer executable code performs the following steps: 
clustering text files within the file system using semantic similarities (Page 1279, 
Abstract). 

The examiner notes that Bellegarda teaches "a semantic hierarchy that 
is based upon the content of said files" as "(discrete) words and documents 
are mapped onto a (continuous) semantic vector space, in which familiar 
clustering techniques can be applied. This leads to the specification of a 
powerful framework for automatic semantic classification, as well as the 
derivation of several language model families with various smoothing properties" 
(Page 1279, Abstract). 

Bellegarda does not explicitly teach: 

B) clustering non-text files within the files system using rule-based techniques; 

C) labeling the resulting clusters; and 

D) displaying the files in a hierarchical format based on the resulting clusters and 
labels. 

Oliver, however, teaches "clustering non-text files within the files 
system using rule-based techniques" as "The marketing system sends the 
recommended document(s), or a link to the recommended document(s) back to 
the client's document server 430. The recommendations can include but are not 
limited to URLs, product numbers, advertisements, products, animations, graphic 
displays, sound files, and applets that are selected, based on the user profile, to 
be interesting and relevant to the user. For example, the most relevant ad for any 
page can be rapidly determined by comparing the current user profile with the 
description of the available advertisements" (Column 11, lines 36-45) and "While 
the present invention is designed to automatically match users with relevant 
content, it is recognized that a client might wish to customize the manner in 
which users receive special promotions, event announcements and special news 
items. In the example of the Roman coin collector, a marketer of cruises might 
wish to target the collector with a promotion for a cruise of the Mediterranean" 
(Column 15, lines 23-29), "labeling the resulting clusters" as "FIG. 6 is an 
example of a user profile 600 generated by the recommendation software, 
according to the preferred embodiment of the present invention. The profile 
shown in the personalized Web page of FIG. 6 comprises two different interest 
folders 602, 604 for a user of an on-line auction Web site. Each interest folder 
contains pages which are intrinsically similar to one another and dissimilar to 
pages in other interest folders. A specific interest folder contains a set of links 
610 to auctions the user has viewed that are related to the theme of the interest 
folder. An interest folder can also include additional information including but not 
limited to information regarding the history of the user's Internet viewing, 
recommendations for the user, a summary of the user's purchases. In the 
example illustrated in FIG. 6, each interest folder also has an associated set of 
keywords 612 that summarize the most important concepts of the particular 
interest folder, as determined by the recommendation software" (Column 14, 
lines 30-46), and "displaying the files in a hierarchical format based on the 
resulting clusters and labels" as "One significant feature of the clustering 
algorithm used by the invention is that the output of the algorithm can be readily 
viewed and understood. Each document cluster (interest folder) is described by 
the most relevant keywords of the documents within the document cluster 510. 
This feature enables both users and marketers to understand and control the 
degree of personalization and targeting that is made" (Column 13, lines 22-28) 
and "FIG. 6 is an example of a user profile 600 generated by the 
recommendation software, according to the preferred embodiment of the present 
invention. The profile shown in the personalized Web page of FIG. 6 comprises 
two different interest folders 602, 604 for a user of an on-line auction Web site. 
Each interest folder contains pages which are intrinsically similar to one another 
and dissimilar to pages in other interest folders. A specific interest folder 
contains a set of links 610 to auctions the user has viewed that are related to the 
theme of the interest folder. An interest folder can also include additional 
information including but not limited to information regarding the history of the 
user's Internet viewing, recommendations for the user, a summary of the user's 
purchases. In the example illustrated in FIG. 6, each interest folder also has an 
associated set of keywords 612 that summarize the most important concepts of 
the particular interest folder, as determined by the recommendation software" 
(Column 14, lines 30-46). 

It would have been obvious to one of ordinary skill in the art at the time the 
invention was made to combine the teachings of the cited references because 
Oliver's teachings would have allowed Bellegarda's method to provide automated 
organization for users, as noted by Oliver (Column 3, lines 23-32). 

Regarding claim 28, Bellegarda teaches a computer system comprising: 
A) a file system storing files (Page 1279, Abstract); 

C) a processor for analyzing the content of files stored in said file system to map 
said files into a semantic vector space, cluster the files within said space based 
on multiple threshold values that are settable to desired levels of granularity 
(Pages 1279 and 1284, Abstract); 

The examiner notes that Bellegarda teaches "a file system storing 
files" as "(discrete) words and documents are mapped onto a (continuous) 
semantic vector space, in which familiar clustering techniques can be applied. 
This leads to the specification of a powerful framework for automatic semantic 
classification, as well as the derivation of several language model families with 
various smoothing properties" (Page 1279, Abstract). The examiner further 
notes that Bellegarda teaches "a processor for analyzing the content of files 
stored in said file system to map said files into a semantic vector space, 
cluster the files within said space based on multiple threshold values that 
are settable to desired levels of granularity" as "(discrete) words and 
documents are mapped onto a (continuous) semantic vector space, in which 
familiar clustering techniques can be applied. This leads to the specification of a 
powerful framework for automatic semantic classification, as well as the 
derivation of several language model families with various smoothing properties" 
(Page 1279, Abstract) and "Once (11) is specified, it is straightforward to proceed 
with the clustering of the word vectors, using any of a variety of algorithms (see, 
for instance, [2]). Since the number of such vectors is relatively large, it is 
advisable to perform this clustering in stages, using, for example, K-means and 
bottom-up clustering sequentially. In that case, K-means clustering is used to 
obtain a coarse partition of the vocabulary into a small set of superclusters. Each 
supercluster is then itself partitioned using bottom-up clustering, resulting in a 
final set of clusters C_k, 1 <= k <= K. This process can be thought of as uncovering, 
in a data-driven fashion, a particular layer of semantic knowledge in the space" 
(Page 1284). 

Bellegarda does not explicitly teach: 
B) a display device; and 

D) derive a hierarchy of plural levels of clusters from said clustering; 

E) a user interface which displays representations of files stored in said file 
system in the form of said derived hierarchy of plural level of clusters. 

Oliver, however, teaches "a display device" as "FIG. 6 is an example of 
a user profile 600 generated by the recommendation software, according to the 
preferred embodiment of the present invention. The profile shown in the 
personalized Web page of FIG. 6 comprises two different interest folders 602, 
604 for a user of an on-line auction Web site. Each interest folder contains pages 
which are intrinsically similar to one another and dissimilar to pages in other 
interest folders. A specific interest folder contains a set of links 610 to auctions 
the user has viewed that are related to the theme of the interest folder. An 
interest folder can also include additional information including but not limited to 
information regarding the history of the user's Internet viewing, recommendations 
for the user, a summary of the user's purchases. In the example illustrated in 
FIG. 6, each interest folder also has an associated set of keywords 612 that 
summarize the most important concepts of the particular interest folder, as 
determined by the recommendation software" (Column 14, lines 30-46), "derive 
a hierarchy of plural levels of clusters from said clustering" as "In the 
preferred embodiment of the invention, the recommendation software uses a 
statistical process referred to herein as document clustering to group together 
those documents of the client document server that have been viewed by the 
user according to their common themes and concepts. For each individual user, 
the recommendation software clusters those documents that have the most 
themes and concepts in common with one another into interest folders 505. In 
the preferred embodiment, the recommendation software continually monitors 
each user and continually updates the user's interest folders and profile" (Column 
12, lines 44-54) and "In the preferred embodiment of the present invention, the 
recommendation software uses a proprietary clustering algorithm to form the 
user interest folders. The clustering algorithm uses the textual content of the 
documents viewed by a user, in combination with structural information about the 
document server, and ancillary information about the user to determine the 
interest folders for a user" (Column 13, lines 10-16), and "a user interface 
which displays representations of files stored in said file system in the 
form of said derived hierarchy of plural level of clusters" as "One significant 
feature of the clustering algorithm used by the invention is that the output of the 
algorithm can be readily viewed and understood. Each document cluster (interest 
folder) is described by the most relevant keywords of the documents within the 
document cluster 510. This feature enables both users and marketers to 
understand and control the degree of personalization and targeting that is made" 
(Column 13, lines 22-28) and "FIG. 6 is an example of a user profile 600 
generated by the recommendation software, according to the preferred 
embodiment of the present invention. The profile shown in the personalized Web 
page of FIG. 6 comprises two different interest folders 602, 604 for a user of an 
on-line auction Web site. Each interest folder contains pages which are 
intrinsically similar to one another and dissimilar to pages in other interest 
folders. A specific interest folder contains a set of links 610 to auctions the user 
has viewed that are related to the theme of the interest folder. An interest folder 
can also include additional information including but not limited to information 
regarding the history of the user's Internet viewing, recommendations for the 
user, a summary of the user's purchases. In the example illustrated in FIG. 6, 
each interest folder also has an associated set of keywords 612 that summarize 
the most important concepts of the particular interest folder, as determined by the 
recommendation software" (Column 14, lines 30-46). 

It would have been obvious to one of ordinary skill in the art at the time the 
invention was made to combine the teachings of the cited references because 
Oliver's teachings would have allowed Bellegarda's method to provide automated 
organization for users, as noted by Oliver (Column 3, lines 23-32). 

Regarding claim 30, Bellegarda further teaches a computer system 
comprising: 

A) wherein said files are text documents (Page 1279, Abstract); and 

B) said processor maps said files on the basis of a language model (Page 1279, 
Abstract). 

The examiner notes that Bellegarda teaches "wherein said files are text 
documents" as "This paper focuses on the use of latent semantic analysis, a 
paradigm that automatically uncovers the salient semantic relationships between 
words and documents in a given corpus" (Page 1279, Abstract). The examiner 
further notes that Bellegarda teaches "said processor maps said files on the 
basis of a language model" as "(discrete) words and documents are mapped 
onto a (continuous) semantic vector space, in which familiar clustering 
techniques can be applied. This leads to the specification of a powerful 
framework for automatic semantic classification, as well as the derivation of 
several language model families with various smoothing properties" (Page 1279, 
Abstract). 

Regarding claim 31, Bellegarda further teaches a computer system 
comprising: 

A) wherein said processor constructs a matrix which associates each word in the 
documents with a vector (Page 1281, Section: A. Feature Extraction, Section: 
B. Singular Value Decomposition); and 

B) associates each document with a vector (Page 1281, Section: A. Feature 
Extraction, Section: B. Singular Value Decomposition). 

The examiner notes that Bellegarda teaches "wherein said processor 
constructs a matrix which associates each word in the documents with a 
vector" as "The starting point is the construction of a matrix (W) of co- 
occurrences between words and documents" (Page 1281, Section: A. Feature 
Extraction) and "The (M x N) word-document matrix W resulting from the above 
feature extraction defines two vector representations for the words and the 
documents. Each word w_i can be uniquely associated with a row vector of 
dimension N, and each document d_j can be uniquely associated with a column 
vector of dimension M" (Page 1281, Section: B. Singular Value Decomposition). 
The examiner further notes that Bellegarda teaches "associates each 
document with a vector" as "The (M x N) word-document matrix W resulting 
from the above feature extraction defines two vector representations for the 
words and the documents. Each word w_i can be uniquely associated with a row 
vector of dimension N, and each document d_j can be uniquely associated with a 
column vector of dimension M" (Page 1281, Section: B. Singular Value 
Decomposition). 

Regarding claim 32, Bellegarda further teaches a computer-readable 
media comprising: 

A) wherein said processor further decomposes said matrix to define the words 
and documents as vectors in a continuous vector space (Page 1281, Section: A. 
Feature Extraction, Section: B. Singular Value Decomposition). 

The examiner notes that Bellegarda teaches "wherein said processor 
further decomposes said matrix to define the words and documents as 
vectors in a continuous vector space" as "To address these issues, it is useful 
to employ a singular value decomposition (SVD), a technique closely related to 
eigenvector decomposition and factor analysis" (Page 1281, Section: B. 
Singular Value Decomposition). 

Regarding claim 33, Bellegarda further teaches a computer system 
comprising: 

A) wherein said processor clusters the files by identifying documents whose 
vectors are within a threshold distance of one another (Page 1284, Section: A. 
Word Clustering). 

The examiner notes that Bellegarda teaches "wherein said processor 
clusters the files by identifying documents whose vectors are within a 
threshold distance of one another" as "This opens up the opportunity to apply 
familiar clustering techniques in S, as long as a distance measure consistent with 
the SVD formalism is defined on the vector space" (Page 1286, Section: A. 
Framework Extension). 
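
For illustration only, and not as part of the cited reference or the claims, 
clustering documents whose vectors lie within a threshold distance of one 
another can be sketched as follows. The cosine distance, the single-link 
grouping rule, the helper names, and the sample vectors are all assumptions 
made for this sketch; the reference requires only a distance measure consistent 
with the SVD formalism. 

```python
# Illustrative sketch only: distance measure, grouping rule, helper names, and
# sample vectors are assumptions, not taken from the cited reference.
import math


def cosine_distance(u, v):
    """Return 1 minus the cosine of the angle between vectors u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def cluster_within_threshold(vectors, threshold):
    """Group indices whose vectors are linked by distances <= threshold."""
    unvisited = set(range(len(vectors)))
    clusters = []
    while unvisited:
        seed = min(unvisited)
        unvisited.remove(seed)
        cluster, frontier = [seed], [seed]
        while frontier:
            current = frontier.pop()
            # Collect every unvisited vector close enough to the current one.
            near = [i for i in unvisited
                    if cosine_distance(vectors[current], vectors[i]) <= threshold]
            for i in near:
                unvisited.remove(i)
            cluster.extend(near)
            frontier.extend(near)
        clusters.append(sorted(cluster))
    return clusters


document_vectors = [
    [1.0, 0.1],
    [0.9, 0.2],
    [0.1, 1.0],
    [0.2, 0.9],
]
fine = cluster_within_threshold(document_vectors, threshold=0.05)
coarse = cluster_within_threshold(document_vectors, threshold=1.0)
```

With these sample vectors, the smaller threshold separates the documents into 
two groups and the larger threshold merges them into one, which shows how a 
settable threshold changes the granularity of the result. 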

Regarding claim 37, Bellegarda does not explicitly teach a method 
comprising: 

A) wherein said deriving step includes organizing the clusters into a hierarchical 
directory structure. 

Oliver, however, teaches "wherein said deriving step includes 
organizing the clusters into a hierarchical directory structure" as "In the 
preferred embodiment of the invention, the recommendation software uses a 
statistical process referred to herein as document clustering to group together 
those documents of the client document server that have been viewed by the 
user according to their common themes and concepts. For each individual user, 
the recommendation software clusters those documents that have the most 
themes and concepts in common with one another into interest folders 505. In 
the preferred embodiment, the recommendation software continually monitors 
each user and continually updates the user's interest folders and profile" (Column 
12, lines 44-54), "In the preferred embodiment of the present invention, the 
recommendation software uses a proprietary clustering algorithm to form the 
user interest folders. The clustering algorithm uses the textual content of the 
documents viewed by a user, in combination with structural information about the 
document server, and ancillary information about the user to determine the 
interest folders for a user" (Column 13, lines 10-16), "One significant feature of 
the clustering algorithm used by the invention is that the output of the algorithm 
can be readily viewed and understood. Each document cluster (interest folder) is 
described by the most relevant keywords of the documents within the document 
cluster 510. This feature enables both users and marketers to understand and 
control the degree of personalization and targeting that is made" (Column 13, 
lines 22-28) and "FIG. 6 is an example of a user profile 600 generated by the 
recommendation software, according to the preferred embodiment of the present 
invention. The profile shown in the personalized Web page of FIG. 6 comprises 
two different interest folders 602, 604 for a user of an on-line auction Web site. 
Each interest folder contains pages which are intrinsically similar to one another 
and dissimilar to pages in other interest folders. A specific interest folder 
contains a set of links 610 to auctions the user has viewed that are related to the 
theme of the interest folder. An interest folder can also include additional 
information including but not limited to information regarding the history of the 
user's Internet viewing, recommendations for the user, a summary of the user's 
purchases. In the example illustrated in FIG. 6, each interest folder also has an 
associated set of keywords 612 that summarize the most important concepts of 
the particular interest folder, as determined by the recommendation software" 
(Column 14, lines 30-46). 

It would have been obvious to one of ordinary skill in the art at the time the 
invention was made to combine the teachings of the cited references because 
Oliver's teachings would have allowed Bellegarda's method to provide automated 
organization for users, as noted by Oliver (Column 3, lines 23-32). 

Regarding claim 38, Bellegarda teaches a method comprising: 

A) mapping all words of the plurality of documents and the plurality of 
documents in a semantic vector space (Page 1279, Abstract); 

B) generating a plurality of clusters based on the semantic similarities of the 
plurality of documents and multiple threshold values that are settable to desired 
levels of granularity (Pages 1279 and 1284, Abstract). 

The examiner notes that Bellegarda teaches "mapping all words of the 
plurality of documents and the plurality of documents in a semantic vector 
space" as "(discrete) words and documents are mapped onto a (continuous) 
semantic vector space, in which familiar clustering techniques can be applied. 
This leads to the specification of a powerful framework for automatic semantic 
classification, as well as the derivation of several language model families with 
various smoothing properties" (Page 1279, Abstract). The examiner further 
notes that Bellegarda teaches "generating a plurality of clusters based on 
the semantic similarities of the plurality of documents and multiple 
threshold values that are settable to desired levels of granularity" as 
"(discrete) words and documents are mapped onto a (continuous) semantic 
vector space, in which familiar clustering techniques can be applied. This leads 
to the specification of a powerful framework for automatic semantic classification, 
as well as the derivation of several language model families with various 
smoothing properties" (Page 1279, Abstract) and "Once (11) is specified, it is 
straightforward to proceed with the clustering of the word vectors, using any of a 
variety of algorithms (see, for instance, [2]). Since the number of such vectors is 
relatively large, it is advisable to perform this clustering in stages, using, for 
example, K-means and bottom-up clustering sequentially. In that case, K-means 
clustering is used to obtain a coarse partition of the vocabulary into a small set 
of superclusters. Each supercluster is then itself partitioned using bottom-up 
clustering, resulting in a final set of clusters C_k, 1 <= k <= K. This process can be 
thought of as uncovering, in a data-driven fashion, a particular layer of semantic 
knowledge in the space" (Page 1284). 

Bellegarda does not explicitly teach: 

C) organizing the plurality of clusters into directories in a hierarchical format of 
plural levels of clusters; 

D) displaying the plurality of documents in said hierarchical format of plural 
levels of clusters based on a result of clustering the plurality of documents. 

Oliver, however, teaches "organizing the plurality of clusters into 
directories in a hierarchical format of plural levels of clusters" as "In the 
preferred embodiment of the invention, the recommendation software uses a 
statistical process referred to herein as document clustering to group together 
those documents of the client document server that have been viewed by the 
user according to their common themes and concepts. For each individual user, 
the recommendation software clusters those documents that have the most 
themes and concepts in common with one another into interest folders 505. In 
the preferred embodiment, the recommendation software continually monitors 
each user and continually updates the user's interest folders and profile" (Column 
12, lines 44-54) and "In the preferred embodiment of the present invention, the 
recommendation software uses a proprietary clustering algorithm to form the 
user interest folders. The clustering algorithm uses the textual content of the 
documents viewed by a user, in combination with structural information about the 
document server, and ancillary information about the user to determine the 
interest folders for a user" (Column 13, lines 10-16), and "displaying the 
plurality of documents in said hierarchical format of plural levels of 
clusters based on a result of clustering the plurality of documents" as "One 
significant feature of the clustering algorithm used by the invention is that the 
output of the algorithm can be readily viewed and understood. Each document 
cluster (interest folder) is described by the most relevant keywords of the 
documents within the document cluster 510. This feature enables both users and 
marketers to understand and control the degree of personalization and targeting 
that is made" (Column 13, lines 22-28) and "FIG. 6 is an example of a user profile 
600 generated by the recommendation software, according to the preferred 
embodiment of the present invention. The profile shown in the personalized Web 
page of FIG. 6 comprises two different interest folders 602, 604 for a user of an 
on-line auction Web site. Each interest folder contains pages which are 
intrinsically similar to one another and dissimilar to pages in other interest 
folders. A specific interest folder contains a set of links 610 to auctions the user 
has viewed that are related to the theme of the interest folder. An interest folder 
can also include additional information including but not limited to information 
regarding the history of the user's Internet viewing, recommendations for the 
user, a summary of the user's purchases. In the example illustrated in FIG. 6, 
each interest folder also has an associated set of keywords 612 that summarize 
the most important concepts of the particular interest folder, as determined by the 
recommendation software" (Column 14, lines 30-46). 

It would have been obvious to one of ordinary skill in the art at the time the 
invention was made to combine the teachings of the cited references because 
Oliver's teachings would have allowed Bellegarda's method to provide automated 
organization for users, as noted by Oliver (Column 3, lines 23-32). 

6. Claims 9-10, 25-26, and 35-36 are rejected under 35 U.S.C. 103(a) as 
being unpatentable over Bellegarda et al. (Article entitled "Exploiting Latent 
Semantic Information in Statistical Language Modeling," dated 10/26/2000) and in 
view of Oliver et al. (U.S. Patent 7,158,986) as applied to claims 1-7, 11, 13-16, 
17-23, 27-28, 30-34, and 37-38 and further in view of Kusama (U.S. Patent 
7,085,767). 

7. Regarding claim 9, Bellegarda and Oliver do not explicitly teach a 
method comprising: 

A) including the step of automatically labeling the clusters based on the resulting 
clusters. 

Kusama, however teaches "including the step of automatically 
labeling the clusters based on the resulting clusters" as "the "Title" of 
"cardinfo.xml" is read, and the folder having the same name as the meta data 
being saved in the "Title" are generated at a predetermined location in the binary 
data storage device. According to this processing, in the case where this, for 
example, the meta data "cardinfo.xml" depicted in FIG. 10, then the folder having 
the name of "Party" which is written in the "Title" is generated" (Column 5, lines 
46-53). 

It would have been obvious to one of ordinary skill in the art at the time the 
invention was made to combine the teachings of the cited references because 
Kusama's teachings would have allowed the combination of Bellegarda and Oliver 
to provide a method for users to find and file documents in multiple locations in 
order to generate/copy data by automatically devising a folder name in order to 
lessen the burden of having to conform to the content of the data, as noted by 
Kusama (Column 1, lines 34-38). 

Regarding claim 10, Bellegarda and Oliver do not explicitly teach a 
method comprising: 

A) wherein said labeling comprises selecting representative words based on the 
closeness of their vectors to the document vectors in a cluster. 

Kusama, however, teaches "wherein said labeling comprises selecting 
representative words based on the closeness of their vectors to the 
document vectors in a cluster" as "the "Title" of "cardinfo.xml" is read, and the 
folder having the same name as the meta data being saved in the "Title" are 
generated at a predetermined location in the binary data storage device. 
According to this processing, in the case where this, for example, the meta data 
"cardinfo.xml" depicted in FIG. 10, then the folder having the name of "Party" 
which is written in the "Title" is generated" (Column 5, lines 46-53). 

It would have been obvious to one of ordinary skill in the art at the time the 
invention was made to combine the teachings of the cited references because 
Kusama's teachings would have allowed the combination of Bellegarda and Oliver 
to provide a method for users to find and file documents in multiple locations in 
order to generate/copy data by automatically devising a folder name in order to 
lessen the burden of having to conform to the content of the data, as noted by 
Kusama (Column 1, lines 34-38). 
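
For illustration only, and not as a characterization of any cited reference, the 
claim 10 limitation of "selecting representative words based on the closeness of 
their vectors to the document vectors in a cluster" can be sketched as follows. 
The function name label_cluster, the use of average cosine similarity as the 
closeness measure, and the sample two-dimensional vectors are all assumptions 
made for this sketch. 

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def label_cluster(word_vectors, doc_vectors, top_n=2):
    """Return the top_n words whose vectors are, on average, closest to the
    document vectors of one cluster. Ties are broken alphabetically."""
    if not doc_vectors:
        return []
    scores = {
        word: sum(cosine(wv, dv) for dv in doc_vectors) / len(doc_vectors)
        for word, wv in word_vectors.items()
    }
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_n]


# Hypothetical vectors in a shared two-dimensional semantic space.
word_vectors = {
    "party": [0.9, 0.1],
    "cake": [0.8, 0.3],
    "invoice": [0.1, 0.9],
}
cluster_docs = [[1.0, 0.0], [0.9, 0.2]]

labels = label_cluster(word_vectors, cluster_docs)
```

In this sketch the label is derived from vector closeness within the cluster 
itself, rather than read from a metadata field. 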

Regarding claim 25, Bellegarda and Oliver do not explicitly teach a 
computer-readable media comprising: 

A) wherein said computer-executable code performs step of automatically 
labeling the clusters based on the resulting clusters. 

Kusama, however, teaches "wherein said computer-executable code 
performs step of automatically labeling the clusters based on the resulting 
clusters" as "the "Title" of "cardinfo.xml" is read, and the folder having the same 
name as the meta data being saved in the "Title" are generated at a 
predetermined location in the binary data storage device. According to this 
processing, in the case where this, for example, the meta data "cardinfo.xml" 
depicted in FIG. 10, then the folder having the name of "Party" which is written in 
the "Title" is generated" (Column 5, lines 46-53). 

It would have been obvious to one of ordinary skill in the art at the time the 
invention was made to combine the teachings of the cited references because 
Kusama's teachings would have allowed the combination of Bellegarda and Oliver 
to provide a method for users to find and file documents in multiple locations in 
order to generate/copy data by automatically devising a folder name in order to 
lessen the burden of having to conform to the content of the data, as noted by 
Kusama (Column 1, lines 34-38). 

Regarding claim 26, Bellegarda and Oliver do not explicitly teach a 
computer-readable media comprising: 

A) wherein said labeling comprises selecting representative words based on the 
closeness of their vectors to the document vectors in a cluster. 

Kusama, however, teaches "wherein said labeling comprises selecting 
representative words based on the closeness of their vectors to the 
document vectors in a cluster" as "the "Title" of "cardinfo.xml" is read, and the 
folder having the same name as the meta data being saved in the "Title" are 
generated at a predetermined location in the binary data storage device. 
According to this processing, in the case where this, for example, the meta data 
"cardinfo.xml" depicted in FIG. 10, then the folder having the name of "Party" 
which is written in the "Title" is generated" (Column 5, lines 46-53). 

It would have been obvious to one of ordinary skill in the art at the time the 
invention was made to combine the teachings of the cited references because 
Kusama's teachings would have allowed the combination of Bellegarda and Oliver 
to provide a method for users to find and file documents in multiple locations in 
order to generate/copy data by automatically devising a folder name in order to 
lessen the burden of having to conform to the content of the data, as noted by 
Kusama (Column 1, lines 34-38). 

Regarding claim 35, Bellegarda and Oliver do not explicitly teach a 
computer system comprising: 

A) wherein said processor automatically labels the clusters based on the 
resulting clusters. 

Kusama, however, teaches "wherein said processor automatically 
labels the clusters based on the resulting clusters" as "the "Title" of 
"cardinfo.xml" is read, and the folder having the same name as the meta data 
being saved in the "Title" are generated at a predetermined location in the binary 
data storage device. According to this processing, in the case where this, for 
example, the meta data "cardinfo.xml" depicted in FIG. 10, then the folder having 
the name of "Party" which is written in the "Title" is generated" (Column 5, lines 
46-53). 

It would have been obvious to one of ordinary skill in the art at the time the 
invention was made to combine the teachings of the cited references because 
Kusama's teachings would have allowed the combination of Bellegarda and Oliver 
to provide a method for users to find and file documents in multiple locations in 
order to generate/copy data by automatically devising a folder name in order to 
lessen the burden of having to conform to the content of the data, as noted by 
Kusama (Column 1, lines 34-38). 

Regarding claim 36, Bellegarda and Oliver do not explicitly teach a 
computer system comprising: 

A) wherein said processor labels the clusters by selecting representative words 
based on the closeness of their vectors to the document vectors in a cluster. 

Kusama, however, teaches "wherein said processor labels the clusters 
by selecting representative words based on the closeness of their vectors 
to the document vectors in a cluster" as "the "Title" of "cardinfo.xml" is read, 
and the folder having the same name as the meta data being saved in the "Title" 
are generated at a predetermined location in the binary data storage device. 
According to this processing, in the case where this, for example, the meta data 
"cardinfo.xml" depicted in FIG. 10, then the folder having the name of "Party" 
which is written in the "Title" is generated" (Column 5, lines 46-53). 

It would have been obvious to one of ordinary skill in the art at the time the 
invention was made to combine the teachings of the cited references because 
Kusama's teachings would have allowed the combination of Bellegarda and Oliver 
to provide a method for users to find and file documents in multiple locations in 
order to generate/copy data by automatically devising a folder name in order to 
lessen the burden of having to conform to the content of the data, as noted by 
Kusama (Column 1, lines 34-38). 

Response to Arguments 

8. Applicant's arguments filed 02/13/2009 have been fully considered but 
they are not persuasive. 

Applicant argues on page 11 that "the Bellegarda article would not have 
taught or suggested, among other features, clustering files within said 
space, wherein multiple threshold values that are settable to desired levels 
of granularity are defined, and said files are clustered based on said multiple 
threshold values; deriving a plural levels of clusters from said clustering, 
as recited in claim 1". However, the examiner wishes to refer to page 1284 of 
Bellegarda, which states: "Once (11) is specified, it is straightforward to proceed 
with the clustering of the word vectors, using any of a variety of algorithms (see, 
for instance, [2]). Since the number of such vectors is relatively large, it is advisable 
to perform this clustering in stages, using, for example, K-means and bottom-up 
clustering sequentially. In that case, K-means clustering is used to obtain a 
coarse partition of the vocabulary into a small set of superclusters. Each 
supercluster is then itself partitioned using bottom-up clustering, resulting in a 
final set of clusters Ck, 1 <= k <= K. This process can be thought of as uncovering, 
in a data-driven fashion, a particular layer of semantic knowledge in the space" 
(1284). The examiner further wishes to state that because Bellegarda teaches 
the argued "superclusters", Bellegarda teaches the equivalently claimed multiple 
threshold values. Moreover, because the combination of Bellegarda and Oliver 
teaches displaying the hierarchical relationships, the aforementioned limitation 
is taught. Moreover, because the formula on which the superclusters are based 
ranges over a settable range, "Ck, 1 <= k <= K", a user can set the clustering to 
whatever granularity he chooses. In addition, Applicant's arguments fail to comply 
with 37 CFR 1.111(b) because they amount to a general allegation that the claims 
define a patentable invention without specifically pointing out how the language 
of the claims patentably distinguishes them from the references. 

Applicant argues on page 11 that "The Bellegarda article does not 
mention settable thresholds, and further it does not disclose files being 
clustered based on said multiple threshold values". However, the examiner 
wishes to refer to page 1284 of Bellegarda, which states: "Once (11) is specified, 
it is straightforward to proceed with the clustering of the word vectors, using any 
of a variety of algorithms (see, for instance, [2]). Since the number of such vectors is 
relatively large, it is advisable to perform this clustering in stages, using, for 
example, K-means and bottom-up clustering sequentially. In that case, K-means 
clustering is used to obtain a coarse partition of the vocabulary into a small set 
of superclusters. Each supercluster is then itself partitioned using bottom-up 
clustering, resulting in a final set of clusters Ck, 1 <= k <= K. This process can be 
thought of as uncovering, in a data-driven fashion, a particular layer of semantic 
knowledge in the space" (1284). The examiner further wishes to state that 
because Bellegarda teaches the argued "superclusters", Bellegarda teaches the 
equivalently claimed multiple threshold values. Moreover, because the 
combination of Bellegarda and Oliver teaches displaying the hierarchical 
relationships, the aforementioned limitation is taught. Moreover, because the 
formula on which the superclusters are based ranges over a settable range, 
"Ck, 1 <= k <= K", a user can set the clustering to whatever granularity he 
chooses. Moreover, because the superclusters cluster files, Applicant's 
arguments are simply incorrect. Both Bellegarda and the claimed invention are 
directed to the same scope of clustering. 
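
For illustration only, the staged clustering described in the quoted passage 
(K-means to obtain a coarse partition into superclusters, followed by bottom-up 
clustering within each supercluster) can be sketched as follows. This is a 
minimal sketch and not code from Bellegarda: the function names, the seeding of 
K-means with the first k points, the centroid-distance merge rule, and in 
particular the distance threshold used to stop merging are assumptions made 
for the example. 

```python
import math


def dist(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def mean(points):
    """Component-wise mean (centroid) of a non-empty list of vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def kmeans(points, k, iterations=20):
    """Coarse stage: plain K-means, seeded with the first k points (assumption)."""
    centers = [list(p) for p in points[:k]]
    groups = [[] for _ in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centers[i]))
            groups[nearest].append(p)
        # Keep the previous center when a group receives no points.
        centers = [mean(g) if g else centers[i] for i, g in enumerate(groups)]
    return [g for g in groups if g]


def bottom_up(points, threshold):
    """Fine stage: repeatedly merge the two clusters with the closest centroids,
    stopping once the closest pair is farther apart than threshold (assumption)."""
    clusters = [[p] for p in points]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = dist(mean(clusters[i]), mean(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        if best[0] > threshold:
            break
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def two_stage(points, k, threshold):
    """Return one list of fine clusters per supercluster."""
    return [bottom_up(supercluster, threshold) for supercluster in kmeans(points, k)]


# Hypothetical two-dimensional vectors forming two well-separated groups.
points = [[0.0, 0.0], [5.0, 5.0], [0.1, 0.0], [1.0, 1.0], [5.1, 5.0], [6.0, 6.0]]
hierarchy = two_stage(points, k=2, threshold=0.5)
```

In this sketch, raising the threshold yields fewer, larger clusters within each 
supercluster, and lowering it yields more, smaller ones. 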

Conclusion 

9. The prior art made of record and not relied upon is considered pertinent to 
applicant's disclosure. 

U.S. Patent 6,820,094 issued to Ferguson et al. on 16 November 2004. 
The subject matter disclosed therein is pertinent to that of claims 1-11, 13-23, 
25-28, 30-33, and 35-38 (e.g., methods that use smart folders to automatically 
organize and relate relevant files). 

U.S. PGPUB 2004/0249865 issued to Lee et al. on 09 December 2004. 
The subject matter disclosed therein is pertinent to that of claims 1-11, 13-23, 
25-28, 30-33, and 35-38 (e.g., methods to automatically name and label folders). 

U.S. PGPUB 2004/0148453 issued to Watanabe et al. on 29 July 2004. 
The subject matter disclosed therein is pertinent to that of claims 1-11, 13-23, 
25-28, 30-33, and 35-38 (e.g., methods to automatically name and label folders). 

U.S. Patent 5,819,258 issued to Vaithyanathan et al. on 06 October 
1998. The subject matter disclosed therein is pertinent to that of claims 1-11, 
13-23, 25-28, 30-33, and 35-38 (e.g., methods that use smart folders to 
automatically organize and relate relevant files). 

U.S. Patent 6,360,227 issued to Aggarwal et al. on 19 March 2002. The 
subject matter disclosed therein is pertinent to that of claims 1-11, 13-23, 25-28, 
30-33, and 35-38 (e.g., methods that use smart folders to automatically organize 
and relate relevant files). 

U.S. Patent 5,899,995 issued to Millier et al. on 04 May 1999. The 
subject matter disclosed therein is pertinent to that of claims 1-11, 13-23, 25-28, 
30-33, and 35-38 (e.g., methods that use smart folders to automatically organize 
and relate relevant files). 

Contact Information 

10. Any inquiry concerning this communication or earlier communications from 
the examiner should be directed to Mahesh Dwivedi whose telephone number is 
(571) 272-2731. The examiner can normally be reached on Monday to Friday 
8:20 am - 4:40 pm. 

If attempts to reach the examiner by telephone are unsuccessful, the 
examiner's supervisor, Tim Vo, can be reached at (571) 272-3642. The fax number 
for the organization where this application or proceeding is assigned is (571) 
273-8300. 

Information regarding the status of an application may be obtained from 
the Patent Application Information Retrieval (PAIR) system. Status information 
for published applications may be obtained from either Private PAIR or Public 
PAIR. Status information for unpublished applications is available through 
Private PAIR only. For more information about the PAIR system, see 
http://pair-direct.uspto.gov . Should you have questions on access to the Private 
PAIR system, contact the Electronic Business Center (EBC) at 866-217-9197 
(toll-free). 

Patent Examiner 
Art Unit 2168 

April 21, 2009 
/Mahesh H Dwivedi/ 
Examiner, Art Unit 2168 

/Tim T. Vo/ 

Supervisory Patent Examiner, Art Unit 2168 



